Computing Inter-Rater Reliability With the SAS System
Abstract
The SAS system V.8 implements the computation of unweighted and weighted kappa statistics as an option in the FREQ procedure. A major limitation of this implementation is that the kappa statistic can only be evaluated when the number of raters is limited to 2. Extensions to the case of multiple raters due to Fleiss (1971) have not been implemented in the SAS system. A SAS macro called MAGREE.SAS, which can handle the case of multiple raters, is available at the SAS Institute’s web site (http://ftp.sas.com/techsup/download/stat/magree.sas). In this article, we discuss the use of the SAS system to compute kappa statistics in general. We also present our SAS macro called INTER RATER.MAC, which can handle multiple raters and can compute AC1 and kappa statistics (overall and for each category) as well as the associated standard errors. In the INTER RATER.MAC SAS macro, the AC1 standard error is calculated both conditionally on the sample of raters and unconditionally. The unconditional standard error takes into account the additional variability that is due to the sampling of raters.
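As a rough illustration of the two statistics named in the abstract, the following Python sketch computes the unweighted Cohen kappa and Gwet's AC1 for the two-rater case only. It is not the INTER RATER.MAC macro and does not reproduce its multiple-rater or standard-error calculations; the function name `two_rater_agreement` and its `categories` parameter are hypothetical, and the number of categories is assumed to be the set of observed labels unless it is passed in explicitly.

```python
from collections import Counter


def two_rater_agreement(ratings_a, ratings_b, categories=None):
    """Return (kappa, ac1) for two raters scoring the same subjects.

    Hypothetical helper for illustration; two raters, unweighted only.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must score the same non-empty set of subjects")
    if categories is None:
        # Assumption: the rating scale consists of the labels actually observed.
        categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories are required")

    n = len(ratings_a)
    # Observed agreement: share of subjects given the same category by both raters.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)

    # Kappa chance agreement: product of the two raters' marginal proportions.
    pe_kappa = sum((count_a[k] / n) * (count_b[k] / n) for k in categories)

    # AC1 chance agreement: based on the average marginal proportion per category.
    pi = {k: (count_a[k] + count_b[k]) / (2 * n) for k in categories}
    pe_ac1 = sum(pi[k] * (1 - pi[k]) for k in categories) / (q - 1)

    kappa = (p_obs - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_obs - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1


if __name__ == "__main__":
    rater_1 = ["y", "y", "n", "n"]
    rater_2 = ["y", "n", "n", "n"]
    print(two_rater_agreement(rater_1, rater_2))
```

Both statistics share the same observed agreement and differ only in how chance agreement is estimated, which is why they can diverge on the same data.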
Similar resources
Using The SAS® System To Examine The Effect Of Augmentation On The Inter-Rater Reliability Of Holistic Ratings
A two-stage process by which an holistic rubric is applied to the assessment of open-ended items, such as writing samples, is defined. The first stage involves scoring a performance by the assignment of an integer rating that is congruent with the proficiency level that is exhibited in the performance. The second stage is the subsequent assignment by the rater of an augmentation that indicates ...
Evaluation of Spasticity Using the Ashworth Scale with Intermediate Scores (ASIS)
Objectives: The main purpose of this research was to study and contribute to an accurate test of the spastic limb. The intra- and inter-rater reliability of the test was examined. Methods: The present study was carried out in two parts. In the first part of the study, the modified Ashworth Scale with Intermediate Scores (ASIS) was studied. During the second part of the study the intra, inter rater re...
Reliability of Body Landmarks Analyzer for Measuring the Quadriceps Angle
Genu varum and genu valgum are the most common postural deformities of the knee joint. The quadriceps angle is used to measure these anomalies. Methods of measuring this angle are divided into two categories: invasive and non-invasive. The purpose of the present research was to study the inter- and intra-rater reliability of the non-invasive Body Landmarks Analyzer method for measuring of the quadriceps...
Are You in Need of Validation? Psychometric Evaluation of Questionnaires Using SAS
Presentations at prior SAS user group meetings have focused on factor analysis and related topics in order to develop the scale structure of a questionnaire. Instead, this presentation will assume that the scale has already been developed but needs validation as a new scale or for use in a new population. The examples are taken from health-related quality-of-life research and will follow the “Gu...
Functional Movement Screen in Elite Boy Basketball Players: A Reliability Study
Purpose: To investigate the reliability of the Functional Movement Screen (FMS) in basketball players. A few studies have compared the reliability of FMS in athletes between raters with different levels of experience. The purpose of this study was to compare FMS scoring between beginner and expert raters using video records. Methods: This is a cross-sectional study. The study subjects compris...